'\" t
.\"
.\" Copyright (c) 2014-2018 Red Hat.
.\"
.\" This program is free software; you can redistribute it and/or modify it
.\" under the terms of the GNU General Public License as published by the
.\" Free Software Foundation; either version 2 of the License, or (at your
.\" option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful, but
.\" WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
.\" or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
.\" for more details.
.\"
.TH LOGARCHIVE 5 "" "Performance Co-Pilot"
.SH NAME
\f3LOGARCHIVE\f1 \- Performance Co-Pilot archive formats
.de TW	\" wrap a table row
.ad l
.hy 0
..
.SH DESCRIPTION
Performance Co-Pilot (PCP) archives store historical values
about arbitrary metrics recorded from a single host.
Archives are machine independent and self-contained \- all
metric data and metadata required for off-line or off-site analysis is held
within an archive.
.PP
The format is stable in order to allow long-term historical
storage and processing by
.BR PMAPI (3)
client tools.
However, the format has evolved over time, and
Versions 2 and 3 are currently supported.
The mandate is that PCP will provide
long-term backwards compatibility, so an archive created
on any version of PCP can be read on that version of PCP and
.B all
subsequent versions of PCP.
The exception is Version 1, which was retired in the PCP Version 2.0
release in May 1998.
.PP
Archives may be read by most PCP client tools, using the
.IR "\-a/\-\-archive NAME"
option, or dumped raw by
.BR pmlogdump (1).
Archives are created primarily by
.BR pmlogger (1),
however they can also be created using the
.BR LOGIMPORT (3)
programming interface.
.PP
Archives may be merged, analyzed, modified and subsampled using
.BR pmlogreduce (1),
.BR pmlogsummary (1),
.BR pmlogrewrite (1)
and
.BR pmlogextract (1).
In addition, PCP archives may be examined in sets or grouped
together into ``archive folios'', which are created and managed
by the
.BR mkaf (1)
and
.BR pmafm (1)
tools.
.PP
An archive consists of several physical files that share a common
arbitrary prefix, e.g.
.IR myarchive .
.TP
\f2myarchive\f1.0, \f2myarchive\f1.1, ...
One or more data volumes containing the metric values and any
error codes encountered during metric sampling.
These are typically the largest of the files and may grow very rapidly,
depending on the selection of metrics to be logged by
.BR pmlogger (1)
and the sampling intervals being used.
.TP
.IR myarchive .meta
Information for PMAPI functions such as
.BR pmLookupName (3),
.BR pmLookupDesc (3),
.BR pmLookupLabels (3)
and
.BR pmLookupInDom (3).
The metadata file may grow sporadically as logged metrics,
instance domains and labels vary over time.
.TP
.IR myarchive .index
A temporal index, mapping timestamps to byte offsets in the other files.
.SH COMMON FEATURES
All three types of files have a similar record-based structure, a
convention of network byte-order (big-endian) encoding, and 32-bit
fields for tagging/padding for those records.
Strings are stored as 8-bit characters without assuming a specific
encoding, so normally ASCII.
See also the
.BR __pmLog*
types in
.IR src/include/pcp/libpcp.h .

.SS RECORD FRAMING
The volume and .meta files are divided into self-identifying records.
.TS
box,center;
c | c | c
r | r | lx.
Offset	Length	Name
_
0	4	T{
.TW
N, length of record, in bytes, including this field
T}
4	N-8	T{
.TW
record payload, usually starting with a 32-bit record type tag
T}
N-4	4	N, length of record (again)
.TE
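.PP
The framing above can be walked with a few lines of code.
The following Python sketch is illustrative only and is not part of PCP:
the names \fIiter_records\fR and \fIframe\fR are hypothetical, and the
length fields are assumed to be unsigned 32-bit big-endian integers.
.PP
.nf
```python
import struct

def iter_records(buf):
    """Yield the payload of each framed record found in buf."""
    pos = 0
    while pos < len(buf):
        if pos + 8 > len(buf):
            raise ValueError("truncated record at offset %d" % pos)
        (length,) = struct.unpack_from(">I", buf, pos)
        if length < 8 or pos + length > len(buf):
            raise ValueError("bad record length at offset %d" % pos)
        # The same length is repeated in the last 4 bytes of the record.
        (trailer,) = struct.unpack_from(">I", buf, pos + length - 4)
        if trailer != length:
            raise ValueError("length mismatch at offset %d" % pos)
        yield buf[pos + 4:pos + length - 4]
        pos += length

def frame(payload):
    """Wrap a payload in the length/payload/length framing."""
    n = struct.pack(">I", len(payload) + 8)
    return n + payload + n
```
.fi
.PP
Comparing the leading and trailing length fields is a cheap way
to detect a truncated or corrupted record before decoding its payload.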

.SS ARCHIVE LABEL
All three types of files begin with an ``archive label'' header, which
identifies the host name, starting timestamp and timezone information;
all referring to the host that was the source of the performance data
(which may be different to the host where
.BR pmlogger (1)
was running).
.PP
The ``archive label'' format differs between Version 2 and Version 3,
with the latter providing enhanced timestamps
(64-bit encoding of the seconds part and nanosecond precision)
and some additional fields.

.TS
box,center;
c   s   s
r | r | c
r | r | lx.
Version 2
_
Offset	Length	Name
_
0	4	tag, PM_LOG_MAGIC | PM_LOG_VERS02=0x50052602
4	4	T{
.TW
process id (PID) of pmlogger process that wrote file
T}
8	4	T{
.TW
archive start time, seconds part (past UNIX epoch)
T}
12	4	archive start time, microseconds part
16	4	T{
.TW
current archive volume number (or \-1=.meta, \-2=.index)
T}
20	64	name of collection host
84	40	T{
.TW
time zone string for collection host ($TZ environment variable)
T}
.TE

.TS
box,center;
c   s   s
r | r | c
r | r | lx.
Version 3
_
Offset	Length	Name
_
0	4	tag, PM_LOG_MAGIC | PM_LOG_VERS03=0x50052603
4	4	PID of pmlogger process that wrote file
8	8	T{
.TW
archive start time, seconds part (past UNIX epoch)
T}
16	4	archive start time, nanoseconds part
20	4	T{
.TW
current archive volume number (or \-1=.meta, \-2=.index)
T}
24	4	archive feature bits
28	4	reserved for future use
32	256	name of collection host
288	256	T{
.TW
timezone string for collection host ($TZ environment variable), e.g. AEDT-11
T}
544	256	T{
.TW
timezone zoneinfo string for collection host, e.g. :Australia/Melbourne
T}
.TE

.PP
The ``archive feature bits'' are intended to encode possible
future extensions or differences
to the on-disk structure or the archive semantics.
At this stage there are no such features, but if they are introduced
at some point in the future, there will be associated PM_LOG_FEATURE_XXX
macros added to
the
.I <pcp/pmapi.h>
header file.

.PP
All fields, except for the ``current archive volume number'', match for
all files in a single PCP archive.
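.PP
As an illustration of the two label layouts, the Python sketch below
decodes a label into a dictionary.
It is not PCP code: \fIparse_label\fR and \fItext\fR are hypothetical
names, the table offsets are assumed to be relative to the start of the
label payload (with any record framing already removed), and the string
fields are assumed to be padded with NUL bytes.
.PP
.nf
```python
import struct

PM_LOG_MAGIC_V2 = 0x50052602
PM_LOG_MAGIC_V3 = 0x50052603

# Big-endian layouts transcribed from the Version 2 and Version 3 tables.
V2_LABEL = struct.Struct(">Iiiii64s40s")
V3_LABEL = struct.Struct(">IiqiiII256s256s256s")

def text(raw):
    """Return the bytes before the first NUL, decoded leniently."""
    return raw.split(bytes(1), 1)[0].decode("ascii", "replace")

def parse_label(payload):
    """Decode an archive label payload (hypothetical helper)."""
    (magic,) = struct.unpack_from(">I", payload, 0)
    if magic == PM_LOG_MAGIC_V2:
        _, pid, sec, usec, vol, host, tz = V2_LABEL.unpack_from(payload, 0)
        # Report the start time as (seconds, nanoseconds) in both cases.
        return {"version": 2, "pid": pid, "start": (sec, usec * 1000),
                "volume": vol, "host": text(host), "timezone": text(tz)}
    if magic == PM_LOG_MAGIC_V3:
        (_, pid, sec, nsec, vol, features, _reserved,
         host, tz, zoneinfo) = V3_LABEL.unpack_from(payload, 0)
        return {"version": 3, "pid": pid, "start": (sec, nsec),
                "volume": vol, "features": features, "host": text(host),
                "timezone": text(tz), "zoneinfo": text(zoneinfo)}
    raise ValueError("not a Version 2 or Version 3 archive label")
```
.fi
.PP
A volume number of \-1 or \-2 in the result identifies the .meta or
\&.index file respectively.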

.SH ARCHIVE VOLUME (.0, .1, ...) RECORDS
.SS pmResult
After the archive label record, an archive volume file contains
one or more records, each providing
metric values corresponding to the
.IR pmResult
from one
.BR pmFetch (3)
operation.
The record size may vary according to the number of metrics being fetched and
the number of instances in the associated instance domains.
.PP
For Version 2 the file size is limited to 2GiB, due to storage of 32-bit
byte offsets within the temporal index.
For Version 3 the file size is limited to 8191PiB, due to storage of 64-bit
byte offsets within the temporal index.
.PP
The
.IR pmResult
format differs between Version 2 and Version 3,
with the latter providing enhanced timestamps
(64-bit encoding of the seconds part and nanosecond precision).

.TS
box,center;
C   s   s
c | c | c
r | r | l.
Version 2
_
Offset	Length	Name
_
0	4	timestamp, seconds part (past UNIX epoch)
4	4	timestamp, microseconds part
8	4	number of metrics with data following
12	M	pmValueSet #0
12+M	N	pmValueSet #1
12+M+N	...	...
NOP	X	pmValueBlock #0
NOP+X	Y	pmValueBlock #1
NOP+X+Y	...	...
.TE

.TS
box,center;
C   s   s
c | c | c
r | r | l.
Version 3
_
Offset	Length	Name
_
0	8	timestamp, seconds part (past UNIX epoch)
8	4	timestamp, nanoseconds part
12	4	number of metrics with data following
16	M	pmValueSet #0
16+M	N	pmValueSet #1
16+M+N	...	...
NOP	X	pmValueBlock #0
NOP+X	Y	pmValueBlock #1
NOP+X+Y	...	...
.TE

.PP
Records with a ``number of metrics'' equal to zero are ``mark records'', and
represent interruptions, missing data, or time discontinuities in
logging.
.SS pmValueSet
This subrecord represents the values for one metric at one point in time.
.TS
box,center;
c | c | c
r | r | l.
Offset	Length	Name
_
0	4	Performance Metrics Identifier (PMID)
4	4	number of values
8	4	value format, PM_VAL_INSITU=0 or PM_VAL_DPTR=1
12	M	pmValue #0
12+M	N	pmValue #1
12+M+N	...	...
.TE

.PP
The metadata describing metrics is found in the .meta file
where the entries are
.B not
timestamped, as the metadata is assumed to be
unchanging throughout an archive.
.SS pmValue
This subrecord represents one value for one instance of a metric
at one point in time.
It is a variant type, depending on the parent
.IR pmValueSet \&'s
value format
field.
This allows small numbers to be encoded compactly, while retaining
flexibility for larger or variable length data to be stored later in the
.I pmResult
record in a
.I pmValueBlock
subrecord.
.TS
box,center;
c | c | c
r | r | lx.
Offset	Length	Name
_
0	4	T{
.TW
internal instance identifier (or PM_IN_NULL=-1 for singular metrics)
T}
4	4	value (INSITU) \fIor\fR
		offset in pmResult to our pmValueBlock (DPTR)
.TE
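.PP
The nesting of
.IR pmResult ,
.I pmValueSet
and
.I pmValue
can be sketched in Python as follows.
This is a simplified illustration, not PCP code:
\fIparse_result_v3\fR is a hypothetical name, only value sets with at
least one value are handled, each value is returned as a raw unsigned
32-bit number (its meaning depends on the metric type), and DPTR offsets
are returned uninterpreted.
.PP
.nf
```python
import struct

PM_VAL_INSITU = 0
PM_VAL_DPTR = 1

def parse_result_v3(payload):
    """Decode a Version 3 pmResult payload (hypothetical helper)."""
    sec, nsec, nmetrics = struct.unpack_from(">qii", payload, 0)
    pos = 16
    valuesets = []
    for _ in range(nmetrics):
        pmid, numval = struct.unpack_from(">Ii", payload, pos)
        if numval < 1:
            # Assumption: error or empty value sets are out of scope here.
            raise ValueError("value set without values is not handled")
        (valfmt,) = struct.unpack_from(">i", payload, pos + 8)
        pos += 12
        values = []
        for _ in range(numval):
            # Each pmValue is an instance identifier plus a 32-bit word
            # holding the value (INSITU) or an offset (DPTR).
            inst, word = struct.unpack_from(">iI", payload, pos)
            values.append((inst, word))
            pos += 8
        valuesets.append((pmid, valfmt, values))
    return sec, nsec, valuesets
```
.fi
.PP
A mark record decodes to an empty list of value sets.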

.PP
The metadata describing the instance domain for metrics is found in the .meta file.
Since the numeric mappings may change during the lifetime of the
logging session, it is important to match up the timestamp of the
measurement record with the corresponding instance domain record.
That is, the instance domain corresponding to a measurement at time T
is the instance domain observation for the metric's instance domain
with largest timestamp T' <= T.
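.PP
The rule of choosing the observation with the largest timestamp
T' <= T amounts to a binary search.
A minimal Python sketch, assuming the observations for one instance
domain have already been decoded into a list sorted by timestamp
(\fIindom_at\fR is a hypothetical name):
.PP
.nf
```python
import bisect

def indom_at(observations, t):
    """Return the instances observed at the latest timestamp <= t.

    observations is a list of (timestamp, instances) pairs sorted by
    timestamp; None is returned if every observation is later than t.
    """
    stamps = [stamp for stamp, _ in observations]
    i = bisect.bisect_right(stamps, t)
    if i == 0:
        return None
    return observations[i - 1][1]
```
.fi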

.SS pmValueBlock
Instances of this subrecord are placed at the end of the
.I pmResult
record, after all the
.I pmValueSet
subrecords.
If (and only if) needed, they are padded at the end to the
next 32-bit boundary.
.TS
box,center;
c | c | c
r | r | l.
Offset	Length	Name
_
0	1	value type (same as \fIpmDesc.type\fR)
1	3	4 + N, the length of the subrecord
4	N	bytes that make up the raw value
4+N	0-3	padding (not included in the 4+N length field)
.TE

.PP
Note that for
.BR PM_TYPE_STRING ,
the length includes an explicit NULL terminator byte.
For
.BR PM_TYPE_EVENT ,
the value byte string is further structured.
Refer to
.BR PMDAEVENTARRAY (3)
for more information about how arrays of event records are packed
inside a
.I pmResult
container.
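.PP
The header and padding rules for a
.I pmValueBlock
can be sketched in Python as below.
This is an illustration rather than PCP code:
\fIparse_value_block\fR is a hypothetical name, and the one-byte type
and three-byte length are read together as a single big-endian 32-bit
word.
.PP
.nf
```python
import struct

def parse_value_block(buf, pos=0):
    """Return (type, raw value bytes, offset of the next block)."""
    (word,) = struct.unpack_from(">I", buf, pos)
    vtype = word >> 24          # first byte: value type
    vlen = word & 0xFFFFFF      # next three bytes: 4 + N
    if vlen < 4 or pos + vlen > len(buf):
        raise ValueError("bad pmValueBlock length at offset %d" % pos)
    raw = buf[pos + 4:pos + vlen]
    # Padding to the next 32-bit boundary is not counted in vlen.
    padded = (vlen + 3) // 4 * 4
    return vtype, raw, pos + padded
```
.fi
.PP
For a string value, the returned bytes end with the terminator byte,
which a caller would normally strip before use.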

.SH METADATA FILE (.meta) RECORDS
After the archive label record, the metadata file contains
interleaved metric description records, timestamped instance domain
records, timestamped label records (for context, instance domain and
metric labels) and (help) text records.
Unlike the data volumes, these records are not forced to 32-bit
alignment.
.PP
For Version 2 the file size is limited to 2GiB, due to storage of 32-bit
byte offsets within the temporal index.
For Version 3 the file size is limited to 8191PiB, due to storage of 64-bit
byte offsets within the temporal index.
.PP
See also
.IR libpcp/src/logmeta.c .
.SS Metric Descriptions
Instances of this
.IR "" ( pmDesc )
record provide the description or metadata for
each metric appearing in the PCP archive.
This metadata includes the metric's
PMID, data type, data semantics, instance domain identifier (or
.B PM_INDOM_NULL
for singular metrics with only one value)
and a set of (1 or more) names.
.TS
box,center;
c | c | c
r | r | lx.
Offset	Length	Name
_
0	4	tag, TYPE_DESC=1
4	4	PMID
8	4	data type (PM_TYPE_*)
12	4	instance domain identifier
16	4	metric semantics (PM_SEM_*)
20	4	units: bit-packed pmUnits
24	4	T{
.ad l
.hy 0
number of alternative names for this PMID
T}
28	4	N: number of bytes in this name
32	N	T{
.ad l
.hy 0
bytes of the name, no NULL terminator nor padding
T}
32+N	4	N2: number of bytes in next name
36+N	N2	T{
.ad l
.hy 0
bytes of the name, no NULL terminator nor padding
T}
\&...	...	...
.TE
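.PP
A metric description record can be decoded along the following lines.
The Python sketch is illustrative only:
\fIparse_desc\fR is a hypothetical name, the payload is assumed to have
had its record framing removed, and the choice of signed or unsigned
interpretation for each field is an assumption.
.PP
.nf
```python
import struct

TYPE_DESC = 1

def parse_desc(payload):
    """Decode a metric description payload (hypothetical helper)."""
    (tag, pmid, dtype, indom, sem, units,
     numnames) = struct.unpack_from(">IIiIiIi", payload, 0)
    if tag != TYPE_DESC:
        raise ValueError("not a metric description record")
    pos = 28
    names = []
    for _ in range(numnames):
        # Each name is a 32-bit length followed by that many bytes,
        # with no terminator and no padding.
        (n,) = struct.unpack_from(">I", payload, pos)
        pos += 4
        names.append(payload[pos:pos + n].decode("ascii", "replace"))
        pos += n
    return {"pmid": pmid, "type": dtype, "indom": indom,
            "sem": sem, "units": units, "names": names}
```
.fi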

.SS Instance Domains
A set-valued metric is defined over an instance domain, which
consists of an instance domain identifier (which will have already been mentioned
in a prior
.IR pmDesc
record),
a count of the number
of instances and a map that defines the association between internal
instance identifiers (integers) and external instance names (strings).
.PP
Because instance domains can change over time, the instance domain
also requires a timestamp, and the same instance domain can occur
multiple times within the .meta file.
The timestamps are used to search for the temporally correct instance
domain
when decoding
.IR pmResult
records from the archive data volumes,
or answering metadata queries against the instance domain.
.PP
The instance domain format differs markedly between Version 2 and Version 3.
Version 3 provides enhanced timestamps
(64-bit encoding of the seconds part and nanosecond precision) and
introduces a new ``delta'' instance domain format that encodes
differences between the previous observation of the instance domain
and the current state of the instance domain.

.TS
box,center;
c   s   s
c | c | c
r | r | l.
Full Instance Domain \- Version 2
_
Offset	Length	Name
_
0	4	tag, TYPE_INDOM_V2=2
4	4	timestamp, seconds part (past UNIX epoch)
8	4	timestamp, microseconds part
12	4	instance domain number
16	4	N: number of instances in domain, normally >0
20	4	first instance number
24	4	second instance number (if appropriate)
\&...	...	...
20+4*N	4	first offset into string table (see below)
20+4*N+4	4	second offset into string table (etc.)
\&...	...	...
20+8*N	M	base of string table, containing
		packed, NULL-terminated instance names
.TE

.TS
box,center;
c   s   s
c | c | c
r | r | l.
Full Instance Domain \- Version 3
_
Offset	Length	Name
_
0	4	tag, TYPE_INDOM=5
4	8	timestamp, seconds part (past UNIX epoch)
12	4	timestamp, nanoseconds part
16	4	instance domain number
20	4	N: number of instances in domain, normally >0
24	4	first instance number
28	4	second instance number (if appropriate)
\&...	...	...
24+4*N	4	first offset into string table (see below)
24+4*N+4	4	second offset into string table (etc.)
\&...	...	...
24+8*N	M	base of string table, containing
		packed, NULL-terminated instance names
.TE
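.PP
The Version 3 ``full'' layout above can be decoded as in the following
Python sketch.
It is not PCP code: \fIparse_indom_v3\fR is a hypothetical name, the
payload is assumed to have had its record framing removed, and the
string offsets are assumed to be relative to the base of the string
table.
.PP
.nf
```python
import struct

TYPE_INDOM = 5

def parse_indom_v3(payload):
    """Decode a Version 3 full instance domain payload."""
    tag, sec, nsec, indom, n = struct.unpack_from(">IqiIi", payload, 0)
    if tag != TYPE_INDOM:
        raise ValueError("not a Version 3 full instance domain record")
    if n < 0:
        raise ValueError("negative instance count")
    # N instance numbers, then N string offsets, then the string table.
    ids = struct.unpack_from(">%di" % n, payload, 24)
    offsets = struct.unpack_from(">%di" % n, payload, 24 + 4 * n)
    table = payload[24 + 8 * n:]
    instances = {}
    for inst, off in zip(ids, offsets):
        end = table.index(bytes(1), off)  # find the terminator byte
        instances[inst] = table[off:end].decode("ascii", "replace")
    return {"timestamp": (sec, nsec), "indom": indom,
            "instances": instances}
```
.fi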

.PP
The ``delta'' instance domain record in Version 3 uses the same
physical structure as the ``full'' instance domain above with the
following differences:
.PD 0
.IP * 3n
The tag is TYPE_INDOM_DELTA=6.
.IP * 3n
The ``number of instances in domain'' field becomes the sum of the
number of instances added and the number of instances deleted.
.IP * 3n
.B Deleted
instances are encoded with the string offset
set to -1 and there is no corresponding string table entry.
.IP * 3n
.B Added
instances are encoded exactly as in the ``full'' instance domain record.
.PD

.PP
The ``delta'' instance domain format is used to provide a more
compact on-disk encoding for instance domains that have a large
number of instances and are subject to frequent small changes,
e.g. the instance domain of process ids, as exported by
.BR pmdaproc (1).

.PP
For ``full'' instance domain records, the new instance domain
\fIreplaces\fR the previous instance domain: prior
records are not searched for instance domain metadata queries
after this timestamp.

.PP
Each instance domain in a Version 3 archive must have an initial ``full''
instance domain record.
Subsequent records for the same instance domain can be the ``full'' or
the ``delta'' variant.
Any instance mentioned in the prior observation of an instance domain
that is not mentioned in the ``delta'' instance domain record is assumed
to continue to exist for the current observation of the instance domain.
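.PP
The way a ``delta'' record combines with the previous observation can
be sketched in Python as below.
The sketch assumes both records have already been decoded into
dictionaries keyed by instance identifier, with a deleted instance
represented by the value None; \fIapply_delta\fR is a hypothetical name.
.PP
.nf
```python
def apply_delta(previous, delta):
    """Return the current instance domain as a new dictionary.

    previous maps instance identifiers to names; delta maps them to a
    name (instance added) or to None (instance deleted).
    """
    current = dict(previous)
    for inst, name in delta.items():
        if name is None:
            current.pop(inst, None)
        else:
            current[inst] = name
    # Instances not mentioned in delta are carried over unchanged.
    return current
```
.fi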

.SS Labels for Contexts, Instance Domains and Metrics
Instances of this
.IR "" ( pmLogLabelSet )
record provide sets of label-name:label-value pairs
associated with labels of the context, instance domains and
individual performance metrics \- refer to
.BR pmLookupLabels (3)
for further details.
.PP
Any instance domain identifier will have already been mentioned
in a prior
.IR pmDesc
record.
.PP
As new labels can appear during an archiving session, these
records are timestamped and must be searched when decoding
.IR pmResult
records from the archive data volumes.
The
.IR pmLogLabelSet
format differs between Version 2 and Version 3,
with the latter providing enhanced timestamps
(64-bit encoding of the seconds part and nanosecond precision).

.TS
box,center;
c   s   s
c | c | c
r | r | lx.
Version 2
_
Offset	Length	Name
_
0	4	T{
tag, TYPE_LABEL_V2=3
T}
4	4	T{
timestamp, seconds part (past UNIX epoch)
T}
8	4	T{
timestamp, microseconds part
T}
12	4	T{
label type (PM_LABEL_* type macros.)
T}
16	4	T{
.TW
numeric identifier - domain, PMID, etc or PM_IN_NULL=-1 for context labels
T}
20	4	T{
.TW
N: number of label sets in this record, usually 1 except in the case of instances
T}
24	4	T{
offset to the start of the JSONB labels string
T}
28	L1	T{
first labelset array entry (see below)
T}
\&...	...	...
28+L1	LN	T{
N-th labelset array entry (see below)
T}
\&...	...	...
28+L1+...LN	M	T{
concatenated JSONB strings for all labelsets
T}
.TE

.TS
box,center;
c   s   s
c | c | c
r | r | lx.
Version 3
_
Offset	Length	Name
_
0	4	T{
tag, TYPE_LABEL=7
T}
4	8	T{
timestamp, seconds part (past UNIX epoch)
T}
12	4	T{
timestamp, nanoseconds part
T}
16	4	T{
label type (PM_LABEL_* type macros.)
T}
20	4	T{
.TW
numeric identifier - domain, PMID, etc or PM_IN_NULL=-1 for context labels
T}
24	4	T{
.TW
N: number of label sets in this record, usually 1 except in the case of instances
T}
28	4	T{
offset to the start of the JSONB labels string
T}
32	L1	T{
first labelset array entry (see below)
T}
\&...	...	...
32+L1	LN	T{
N-th labelset array entry (see below)
T}
\&...	...	...
32+L1+...LN	M	T{
concatenated JSONB strings for all labelsets
T}
.TE

.PP
Records of this form \fIreplace\fR the existing labels for a given
label type: prior records are not searched for resolving that class of
label in measurements after this timestamp.
.PP
The individual labelset array entries are variable length, depending
on the number of labels present within that set.
These entries contain the instance identifiers (in the case of type
.B PM_LABEL_INSTANCES
labels), lengths and offsets of each label name
and value, and also any flags set for each label.
.TS
box,center;
c | c | c
r | r | l.
Offset	Length	Name
_
0	4	instance identifier (or PM_IN_NULL=-1)
4	4	length of JSONB label string
8	4	N: number of labels in this labelset
12	2	first label name offset
14	1	first label name length
15	1	first label flags (e.g. optionality)
16	2	first label value offset
18	2	first label value length
20	2	second label name offset (if appropriate)
\&...	...	...
.TE

.SS Help Text
This
.IR "" ( pmLogText )
record stores help text associated with a metric or an
instance domain \- as provided by
.BR pmLookupText (3)
and
.BR pmLookupInDomText (3).
.PP
The metric identifier and instance domain identifier will have
already been mentioned in a prior
.IR pmDesc
record.

.TS
box,center;
c | c | c
r | r | l.
Offset	Length	Name
_
0	4	tag, TYPE_TEXT=4
4	4	text and identifier type (PM_TEXT_* macros.)
8	4	numeric identifier - PMID or instance domain
12	M	help text string, arbitrary text
.TE

.SH INDEX FILE (.index) RECORDS
After the archive label record, the temporal index file contains a
plainly concatenated, unframed group of tuples, which relate timestamps
to the byte offsets in the volume and .meta files.
These records are fixed size, fixed format, and are \fInot\fR enclosed
in the standard length/payload/length wrapper: they take up the entire
remainder of the .index file after the archive label record.
.PP
The temporal index file provides
a rapid way of seeking to a particular point of time within an archive for
both the performance metric values and the associated metadata.
.PP
See also
.IR libpcp/src/logutil.c .
.PP
The index format differs between Version 2 and Version 3,
with the latter providing enhanced timestamps
(64-bit encoding of the seconds part and nanosecond precision)
and 64-bit byte offsets.

.TS
box,center;
c   s   s
r | r | c
r | r | l.
Version 2
_
Offset	Length	Name
_
0	4	timestamp, seconds part (past UNIX epoch)
4	4	timestamp, microseconds part
8	4	archive volume number (0...N)
12	4	byte offset in .meta file
16	4	byte offset in archive volume file
.TE

.TS
box,center;
c   s   s
r | r | c
r | r | l.
Version 3
_
Offset	Length	Name
_
0	8	timestamp, seconds part (past UNIX epoch)
8	4	timestamp, nanoseconds part
12	4	archive volume number (0...N)
16	8	byte offset in .meta file
24	8	byte offset in archive volume file
.TE

.PP
Since the temporal index is optional, and exists only to speed up
time-based random access to metrics and their metadata, the index
records are emitted only intermittently.
An archive reader program should not presume any particular rate of
data flow into the index.
However, common events that may trigger a new temporal index record
include changes in instance domains, switching over to a new archive
volume, and starting or stopping logging.
One reliable invariant however is that, for each index entry, there
are to be no meta or archive volume records with a timestamp after
that in the index, but physically before the
associated byte offset in the index.
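.PP
Reading the Version 3 index and choosing a starting point for a
time-based seek can be sketched in Python as follows.
This is an illustration, not PCP code:
\fIparse_index_v3\fR and \fIfind_entry\fR are hypothetical names, the
input is assumed to be the part of the .index file that follows the
archive label record, and the entries are assumed to be in time order.
.PP
.nf
```python
import bisect
import struct

# seconds, nanoseconds, volume number, .meta offset, volume offset
V3_ENTRY = struct.Struct(">qiiqq")

def parse_index_v3(body):
    """Split the unframed index body into fixed-size tuples."""
    if len(body) % V3_ENTRY.size != 0:
        raise ValueError("index body is not a whole number of entries")
    return [V3_ENTRY.unpack_from(body, off)
            for off in range(0, len(body), V3_ENTRY.size)]

def find_entry(entries, sec, nsec):
    """Return the last entry not later than (sec, nsec), or None."""
    keys = [(e[0], e[1]) for e in entries]
    i = bisect.bisect_right(keys, (sec, nsec))
    return entries[i - 1] if i else None
```
.fi
.PP
Given the invariant described above, a reader that starts at the
offsets of the returned entry and scans forward should not miss any
record with a later timestamp.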

.SH FILES
Several PCP tools create archives in standard locations:
.PP
.PD 0
.TP 10
.B $HOME/.pcp/pmlogger
default location for the interactive chart recording mode in
.BR pmchart (1)
.TP 10
.B $PCP_LOG_DIR/pmlogger
default location for
.BR pmlogger_daily (1)
and
.BR pmlogger_check (1)
scripts
.PD
.SH SEE ALSO
.BR mkaf (1),
.BR PCPIntro (1),
.BR pmafm (1),
.BR pmchart (1),
.BR pmdaproc (1),
.BR pmlogdump (1),
.BR pmlogextract (1),
.BR pmlogger (1),
.BR pmlogger_check (1),
.BR pmlogger_daily (1),
.BR pmlogreduce (1),
.BR pmlogrewrite (1),
.BR pmlogsummary (1),
.BR LOGIMPORT (3),
.BR PMAPI (3),
.BR PMDAEVENTARRAY (3),
.BR pmFetch (3),
.BR pmLookupDesc (3),
.BR pmLookupInDom (3),
.BR pmLookupInDomText (3),
.BR pmLookupLabels (3),
.BR pmLookupName (3),
.BR pmLookupText (3),
.BR pcp.conf (5)
and
.BR pcp.env (5).

.\" control lines for scripts/man-spell
.\" +ok+ PM_LOG_FEATURE_XXX INSITU endian
.\" +ok+ subrecords subsampled labelsets
.\" +ok+ pmLogText pmLogLabelSet {from "record" types described here}
.\" +ok+ subrecord myarchive labelset unframed
.\" +ok+ AEDT DPTR NOP PiB src LN
.\" +ok+ __pmLog {from __pmLog* types in ...}
.\" +ok+ logmeta {from libpcp/src/logmeta.c}
.\" +ok+ logutil {from libpcp/src/logutil.c}
.\" +ok+ PM_SEM_ {from PM_SEM_*} PM_TEXT_ {from PM_TEXT_*}
.\" +ok+ PM_LABEL_ {from PM_LABEL_*}
